Identifying transcription factor–DNA interactions using machine learning
Authors
Abstract
Machine learning approaches have been applied to identify transcription factor (TF)–DNA interactions important for gene regulation and expression. However, due to the enormous search space of the genome, it is challenging to build models capable of surveying entire reference genomes, especially in species where the models were not trained. In this study, we surveyed a variety of methods for classification of epigenomics data in an attempt to improve the detection of 12 members of the auxin response factor (ARF)-binding DNAs from maize and soybean as assessed by DNA Affinity Purification and sequencing (DAP-seq). We used the classification for prediction by minimizing the genome search space, surveying only unmethylated regions (UMRs). For identification of DAP-seq-binding events within the UMRs, we achieved a 78.72 % accuracy rate across the 12 members of ARFs of maize on average by encoding DNA with count vectorization for k-mers, with a logistic regression classifier, up-sampling and feature selection. Importantly, feature selection helps to uncover known and potentially novel ARF-binding motifs. This demonstrates an independent method for identification of TF-binding sites. Finally, we tested the model built with maize DAP-seq data, applied it directly to the soybean genome, and found high false-negative rates, which accounted for more than 40 % across the ARF TFs tested. The findings in this study suggest the potential use of various methods to predict TF–DNA interactions within and between species with varying degrees of success.
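As a rough illustration of two steps the abstract names, k-mer count vectorization and up-sampling, the sketch below uses only the Python standard library. It is not the authors' pipeline: the function names, the toy sequences, the labels and the choice of k = 2 are all assumptions made for illustration, and the logistic regression and feature-selection steps are left out.

```python
from collections import Counter
from itertools import product
import random


def kmer_vocabulary(k, alphabet="ACGT"):
    """Return every possible k-mer over the alphabet in a fixed order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def kmer_count_vector(seq, k, vocab):
    """Count overlapping k-mers in seq and lay the counts out in vocab order.

    k-mers containing characters outside the vocabulary (e.g. N) are ignored.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocab]


def upsample_minority(X, y, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(X, y):
        by_label.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_label.values())
    X_out, y_out = [], []
    for label, rows in by_label.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 1 = bound region, 0 = unbound region.
sequences = ["TTGTCGGA", "ACACACAC", "GGAATTCC", "CATGCATG"]
y = [1, 0, 0, 0]

k = 2
vocab = kmer_vocabulary(k)
X = [kmer_count_vector(s, k, vocab) for s in sequences]
X_bal, y_bal = upsample_minority(X, y)

print(len(vocab), "features per sequence")
print("class sizes after up-sampling:", Counter(y_bal))
```

The balanced vectors `X_bal` and `y_bal` would then be passed to a classifier. With a linear model such as logistic regression, each coefficient corresponds to one k-mer, which is one way a feature-selection step can point back to candidate binding motifs.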
Similar resources
Identifying Publication Types Using Machine Learning
Every year the number of journals and the number of articles to be indexed grows at the U.S. National Library of Medicine (NLM), causing an ever-increasing demand on the highly qualified but relatively small, dedicated staff of indexers. We present a methodology for identifying MeSH (Medical Subject Headings) Publication Types for assisting the indexers in the categorization of these MEDLINE c...
Identifying Students' Inquiry Planning Using Machine Learning
This research investigates the detection of student meta-cognitive planning processes in real-time using log tracing techniques. We use fine and coarse-grained data distillation, in combination with coarse-grained text replay coding, in order to develop detectors for students’ planning of experiments in Science Assistments, an assessment and tutoring system for scientific inquiry. The goal is t...
Acoustic Event Detection Using Machine Learning: Identifying Train Events
Light-rail systems are becoming more popular in cities and urban residential areas around the country. One of the main environmental impacts from light-rail systems is noise from the trains as they pass through residential areas. In response to increasing noise complaints, it is becoming more common to perform noise measurements in the residential areas and attempt to identify noise mitigation ...
Identifying incipient dementia individuals using machine learning and amyloid imaging
Identifying individuals destined to develop Alzheimer's dementia within time frames acceptable for clinical trials constitutes an important challenge to design studies to test emerging disease-modifying therapies. Although amyloid-β protein is the core pathologic feature of Alzheimer's disease, biomarkers of neuronal degeneration are the only ones believed to provide satisfactory predictions of...
Identifying dementia in MRI scans using machine learning
A support vector machine and naive Bayes classifier are used to identify the presence of dementia in MRI brain scans. Features are computed for each scan: scans are processed via K-means to segment the image into different tissue types, from which different quantities are computed (e.g., total gray matter, a measure of symmetry); principal component analysis is used to reduce dimensionality of ...
Journal
Journal title: in silico Plants
Year: 2022
ISSN: 2517-5025
DOI: https://doi.org/10.1093/insilicoplants/diac014